feat(telemetry): unify span creation paths for hierarchical trace tree #4126
doudouOUC wants to merge 13 commits into
Pull request overview
This PR reworks telemetry span creation so LLM and tool spans use session-tracing helpers and can form a hierarchical trace tree under interaction spans.
Changes:
- Adds tool-span AsyncLocalStorage scoping and updates session-tracing tests/exports.
- Replaces generic API/tool tracing wrappers with typed LLM/tool span helpers.
- Adds a workflow tracing gap analysis design document.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| packages/core/src/telemetry/session-tracing.ts | Adds tool context scoping and updates tool execution span parenting/status behavior. |
| packages/core/src/telemetry/session-tracing.test.ts | Updates and expands tests for tool context and span status behavior. |
| packages/core/src/telemetry/index.ts | Exports the new tool context helper. |
| packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts | Switches API tracing to LLM request span helpers. |
| packages/core/src/core/loggingContentGenerator/loggingContentGenerator.test.ts | Updates logging generator telemetry mocks/assertions for LLM spans. |
| packages/core/src/core/coreToolScheduler.ts | Switches tool tracing to typed tool/tool-execution spans. |
| packages/core/src/core/client.ts | Removes the redundant client-level generateContent span wrapper. |
| packages/core/src/core/client.test.ts | Removes assertions tied to the deleted client span. |
| docs/design/workflow-tracing-gaps.md | Documents current tracing gaps and phased roadmap. |
Comments suppressed due to low confidence (3)
packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts:467
- This finalizer now relies on `endLLMRequestSpan`, which can throw from span operations. In the streaming path that means a successfully consumed stream can still fail during `finally` (the existing "preserves stream success when the OK status update fails" test covers this behavior). Keep span finalization best-effort so telemetry errors do not surface to callers or mask stream errors.

```typescript
      if (span) {
        endLLMRequestSpan(span, {
          success: !errorOccurred,
          inputTokens: lastUsageMetadata?.promptTokenCount,
          outputTokens: lastUsageMetadata?.candidatesTokenCount,
          durationMs: Date.now() - startTime,
          error: errorOccurred
            ? API_CALL_FAILED_SPAN_STATUS_MESSAGE
            : undefined,
        });
```
packages/core/src/core/coreToolScheduler.ts:1998
- `endToolExecutionSpan` can throw from OTel span operations, and because this call is inside the tool execution `try`, a telemetry failure after a successful tool result is caught as `executionError` and turns the tool call into an error. Make the execution-span end best-effort (or move it outside the tool-result error handling with its own guard) so tracing failures do not change tool outcomes.

```typescript
      const toolResult: ToolResult = await promise;
      endToolExecutionSpan(execSpan, {
        success: toolResult.error === undefined,
      });
```
packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts:267
- On the error path, `endLLMRequestSpan` can throw while setting span status/ending the span, which would replace the original upstream error before it is rethrown. The previous `safeSetStatus` path explicitly avoided masking API errors. Guard this telemetry finalization so callers still receive the original generation failure.

```typescript
      endLLMRequestSpan(llmSpan, {
        success: false,
        durationMs,
        error: API_CALL_FAILED_SPAN_STATUS_MESSAGE,
      });
```
373b64f to 609e32b
Code Coverage Summary
CLI Package - Full Text Report
Core Package - Full Text Report
For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run.
wenshao left a comment
[Critical] [test] coreToolScheduler.test.ts: 13 telemetry span tests fail — The test mock intercepts withSpan from tracer.js, but production code now imports startToolSpan/endToolSpan/runInToolSpanContext from session-tracing.ts. toolSpanRecords stays empty; all span assertions fail. The refactored span lifecycle in executeSingleToolCall and _executeToolCallBody has zero test coverage.
[Critical] [test] loggingContentGenerator.test.ts: 3 tests fail — (1) Attribute name mismatch: test expects unprefixed model/prompt_id but mock uses prefixed qwen-code.model/qwen-code.prompt_id. (2) Mock endLLMRequestSpan calls span.setStatus() without try/catch; errors propagate from finally block. Production endLLMRequestSpan wraps everything in try/catch.
Suggested fixes for test failures:
- Update coreToolScheduler test mock to intercept `startToolSpan`/`endToolSpan`/`runInToolSpanContext`/`startToolExecutionSpan`/`endToolExecutionSpan` from `'../telemetry/session-tracing.js'`
- Fix attribute name assertions to use prefixed names
- Wrap the mocked `endLLMRequestSpan` body in try/catch matching production behavior
wenshao left a comment
[Suggestion] endToolSpan and endLLMRequestSpan have asymmetric success attribute handling: endLLMRequestSpan unconditionally sets success on every call, while endToolSpan only sets it on success paths (no metadata → no success attribute). This means dashboard queries filtering on success = false will miss all non-success tool spans. Consider making them consistent — either always set success in endToolSpan too, or add { success: false } to error/cancel paths.
— DeepSeek/deepseek-v4-pro via Qwen Code /review
3919ade to debe7f4
wenshao left a comment
Incremental review of new commits (since last review at 3919ade)
What changed
- IDE context is now merged into the user prompt via `<system-reminder>` wrapper (no more separate `addHistory` calls)
- `escapeSystemReminderTags` now used for both rule content and IDE text (replaces narrow `</system-reminder>` regex)
- `forceFullIdeContext` set on Error and ChatCompressed events
- State update deferral to first stream event, with tests covering throw-before-event case
- Microcompact log format updated for per-kind counts
Assessment
The incremental changes are well-structured and well-tested. The IDE context merge + system-reminder escaping is a clear improvement. Key concerns from the previous review about:
- Tool span naming regression
- Redundant `setStatus(OK)`
- Stream vs non-stream span naming
- Unprotected code before `try` block in `_executeToolCallBody`
...are in files not touched by this incremental update and remain open.
— DeepSeek/deepseek-v4-pro via Qwen Code /review
6ef1929 to 81a8869
#3731 P3 Phase 1)
Replace disconnected withSpan/startSpanWithContext calls in runtime with session-tracing typed helpers so LLM and tool spans become children of the interaction span instead of siblings under the session root.
- Add toolContext ALS with runInToolSpanContext() for concurrent-safe tool span scoping (uses AsyncLocalStorage.run, not enterWith)
- Wire startLLMRequestSpan/endLLMRequestSpan in loggingContentGenerator for both streaming and non-streaming paths
- Wire startToolSpan/endToolSpan + startToolExecutionSpan/endToolExecutionSpan in coreToolScheduler with proper try/finally lifecycle
- Remove redundant withSpan('client.generateContent') wrapper from client.ts
- Fix endToolSpan to not override pre-set status when metadata is omitted
- Change startToolExecutionSpan to read parent from toolContext ALS
- Update tests for new span creation APIs and remove dead test infrastructure
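The `AsyncLocalStorage.run` choice in this commit is what keeps concurrent tool calls from seeing each other's span. A minimal sketch of that scoping, assuming a hypothetical `ToolSpanLike` in place of the OTel span and hypothetical `currentToolSpan` and `simulateTool` helpers (the PR's `runInToolSpanContext` may differ in signature and also activates OTel context in a later commit):

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical stand-in for an OTel span; only a name is needed for the demo.
interface ToolSpanLike {
  name: string;
}

// One store per async call chain, mirroring the toolContext ALS described above.
const toolContext = new AsyncLocalStorage<ToolSpanLike>();

// run() binds the span to fn and everything fn awaits, and the store is not
// visible outside that call. enterWith() would instead replace the store for
// the rest of the caller's own execution.
function runInToolSpanContext<T>(span: ToolSpanLike, fn: () => T): T {
  return toolContext.run(span, fn);
}

// What a child-span factory would read to find its parent.
function currentToolSpan(): ToolSpanLike | undefined {
  return toolContext.getStore();
}

// Simulates a tool that does async work before looking up its parent span.
async function simulateTool(
  name: string,
  delayMs: number,
): Promise<string | undefined> {
  return runInToolSpanContext({ name }, async () => {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    return currentToolSpan()?.name;
  });
}
```

Running two `simulateTool` calls through `Promise.all` with different delays should give each call its own span name back, and `currentToolSpan()` outside any `runInToolSpanContext` call should be `undefined`.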
- Remove unused _toolSpan variable (TS6133)
- Use bracket notation for index signature property access (TS4111)
…test mocks
- coreToolScheduler.test.ts: mock startToolSpan/endToolSpan/runInToolSpanContext instead of withSpan; update cancellation tests for restored safeSetStatus call
- loggingContentGenerator.test.ts: fix attribute keys in mock, add try/catch in endLLMRequestSpan mock to match production best-effort behavior
- Add debugLogger.warn in catch blocks of endLLMRequestSpan/endToolSpan/endToolExecutionSpan instead of silent swallowing
- Add JSDoc on endToolSpan documenting intentional no-metadata-no-status contract with setToolSpanFailure/setToolSpanCancelled
- Add warning in startToolExecutionSpan when called outside runInToolSpanContext (no active toolContext)
- Sanitize error message in endToolExecutionSpan: use constant TOOL_SPAN_STATUS_TOOL_EXCEPTION instead of raw error message
…heduler tests The full mock shadowed all re-exports (logToolCall, etc.) causing 49 test failures. Use importActual to preserve other exports, only override span functions.
startToolExecutionSpan mock also pushes to toolSpanRecords, so at(-1) returns the execution sub-span instead of the tool span. Use findLast to filter by name.
- Remove redundant safeSetStatus(span, OK) on success path: endToolSpan in finally already sets OK via metadata
- Add llm_request.stream attribute (true/false) to distinguish streaming vs non-streaming LLM requests in trace backends
Bypass span.setStatus() in mock to avoid potential interference from vitest module resolution. Write to statusCalls/ended directly on the ToolSpanRecord.
…/index.js Mocking the barrel re-export (telemetry/index.js) with importActual was unreliable — vitest's module resolution could bind production code to the real endToolSpan before the mock override took effect. Mock the source module (session-tracing.js) directly to guarantee interception.
…y in finally Root cause: checkAndNotifyCompletion clears this.toolCalls before the finally block in executeSingleToolCall runs, so the tc lookup always returns undefined. Fix: set OK status explicitly in _executeToolCallBody's success path via safeSetStatus(span, OK), and call endToolSpan() without metadata in finally (just ends the span, preserves pre-set status from any path).
81a8869 to eef8ce7
…n on failure
- Wrap non-stream generateContent API call + logging in context.with(spanContext) so nested OTel spans (HTTP instrumentation, log-bridge spans) parent to qwen-code.llm_request instead of session root (matches streaming path).
- runInToolSpanContext now also activates OTel context via otelContext.with, not just the custom toolContext ALS. Hooks/HTTP/IO during tool execution now correctly parent to qwen-code.tool span.
- Split end*Span helpers: span.end() runs in its own try/catch so a throwing setAttributes/setStatus can't leak unended spans.
…cution span timing
- start{LLMRequest,Tool,ToolExecution}Span now fall back to getSessionContext()
when no parent context, instead of otelContext.active(). Side-query LLM calls
(auto-title, recap) now stay in the session trace instead of starting a new
detached trace.
- Move startToolExecutionSpan() to BEFORE invocation.execute(), matching
claude-code. Previously the synchronous setup inside execute (shell command
preprocessing, child_process.spawn) ran outside the execution span.
```typescript
        );
        this.notifyToolCallsUpdate();
      };
      promise = invocation.execute(
```
[Critical] invocation.execute(...) now runs before the try block, so synchronous setup failures bypass the failure path. This PR intentionally starts the execution sub-span before execute() to bracket synchronous setup, but if execute() throws before returning a promise (for example during shell setup), the catch block below never runs: the execution span is not ended with failure metadata, failure hooks are skipped, and the tool call can miss the normal terminal error response. Move the try so it wraps both the execute() call and the await promise, and keep the existing catch/failure handling shared for synchronous and async failures.
— gpt-5.5 via Qwen Code /review
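The fix requested above amounts to putting the `execute()` call itself inside the `try`. A minimal sketch of that shape, assuming hypothetical `ToolResult` and `Invocation` shapes and a hypothetical `runWithSharedFailurePath` wrapper (the scheduler's real method carries far more state):

```typescript
// Hypothetical result type; the real ToolResult has more fields.
interface ToolResult {
  error?: string;
}

// Hypothetical invocation shape: execute() may throw before returning a promise.
interface Invocation {
  execute(): Promise<ToolResult>;
}

async function runWithSharedFailurePath(
  invocation: Invocation,
  onFailure: (error: unknown) => void,
): Promise<ToolResult | undefined> {
  try {
    // execute() is inside the try, so a synchronous throw during setup
    // reaches the same catch as a rejected promise.
    const promise = invocation.execute();
    return await promise;
  } catch (error) {
    // Single place to end the execution span and run failure hooks.
    onFailure(error);
    return undefined;
  }
}
```

With this shape, an `execute()` that throws during shell setup and one that returns a rejected promise both reach `onFailure`, so span finalization and failure hooks need only one code path.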
```typescript
        } catch {
          // OTel errors must not interrupt the consumer.
        }
        endLLMRequestSpan(span, {
```
[Suggestion] After the idle timer fires, the helper has already ended and removed the span, but this wrapper does not keep a local spanEnded/timed-out flag. If the consumer resumes after a long idle gap, later chunks still update lastUsageMetadata, but the final endLLMRequestSpan(...) call is a no-op, so the exported span remains an error/timed-out span and loses the real terminal status/token counts. The previous code kept a local spanEnded guard for this timeout path. Consider restoring a local guard/timeout state so resumed streams do not silently discard their final telemetry, or only end once the generator is truly abandoned.
— gpt-5.5 via Qwen Code /review
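A local guard of the kind described above could look like the following. This is only a sketch under stated assumptions: `StreamSpanGuard`, `EndMeta`, and the `endSpan` callback are hypothetical names, and the real wrapper also manages the idle timer itself.

```typescript
// Hypothetical metadata passed when a span is ended.
interface EndMeta {
  success: boolean;
  outputTokens?: number;
}

// Tracks whether the idle timeout already ended the span, so the stream's
// finally block can tell a real end from a silent no-op.
class StreamSpanGuard {
  private endedByTimeout = false;
  private readonly endSpan: (meta: EndMeta) => void;

  constructor(endSpan: (meta: EndMeta) => void) {
    this.endSpan = endSpan;
  }

  // Called when the idle timer fires: end once and remember why.
  onIdleTimeout(): void {
    if (this.endedByTimeout) return;
    this.endedByTimeout = true;
    this.endSpan({ success: false });
  }

  // Called from the stream's finally block. Returns false when the span was
  // already ended by the timeout, so the caller can log the dropped metadata.
  finish(meta: EndMeta): boolean {
    if (this.endedByTimeout) return false;
    this.endSpan(meta);
    return true;
  }
}
```

Returning a boolean from `finish` is a design choice for the sketch: it does not recover the lost token counts, but it makes the drop observable at the call site instead of silent.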
```typescript
      // active via runInToolSpanContext, and tool implementations don't
      // currently emit nested OTel spans of their own — the span boundary
      // is purely for timing/attribution.
      const execSpan = startToolExecutionSpan();
```
[Suggestion] The scheduler tests mock startToolExecutionSpan / endToolExecutionSpan, but they do not assert that this new execution sub-span is started and ended with the expected metadata on the concrete scheduler paths. A regression could drop the sub-span or mark ToolResult.error / thrown exceptions as successful without failing these tests. Please add scheduler-level assertions for success, ToolResult.error, thrown invocation exceptions, and pre-hook denial not starting an execution span.
— gpt-5.5 via Qwen Code /review
```typescript
      this.wrapped.generateContent(req, userPromptId),
    const llmSpan = startLLMRequestSpan(req.model, userPromptId);
    try {
      llmSpan.setAttribute('llm_request.stream', false);
```
[Suggestion] This PR adds llm_request.stream to preserve stream vs non-stream observability after both paths moved to the shared qwen-code.llm_request span name, but the tests only check span creation/status and do not assert this attribute. Please have the span mock record setAttribute calls and assert generateContent() sets llm_request.stream to false while generateContentStream() sets it to true.
— gpt-5.5 via Qwen Code /review
```typescript
      });
      endLLMRequestSpan(llmSpan, {
        success: true,
        inputTokens: response.usageMetadata?.promptTokenCount,
```
[Suggestion] The new LLM span ending path forwards token counts, duration, and sanitized error metadata into endLLMRequestSpan, but the test mock discards the metadata and existing tests only observe status/ended state. That leaves token attribution regressions untested for both non-stream responses and stream final lastUsageMetadata. Please capture the endLLMRequestSpan metadata in tests and assert non-stream success, stream success, and error paths forward the expected token/error values.
— gpt-5.5 via Qwen Code /review
```typescript
    },
  ),
  runInToolSpanContext: vi.fn(<T>(_span: unknown, fn: () => T): T => fn()),
  startToolExecutionSpan: vi.fn(() => createMockToolSpan('tool.execution', {})),
```
[Critical] startToolExecutionSpan/endToolExecutionSpan lifecycle is completely untested in the scheduler tests.
The mock defines startToolExecutionSpan: vi.fn(() => createMockToolSpan('tool.execution', {})) and endToolExecutionSpan: vi.fn(...) but no test ever asserts on them. getLastToolSpan() explicitly excludes tool.execution records (r.name !== 'tool.execution'), making the execution sub-span invisible to the test infrastructure.
If a future refactor moves or removes these calls in _executeToolCallBody, all tool calls would lose execution timing attribution in tracing backends — but the test suite would not detect it.
```typescript
startToolExecutionSpan: vi.fn(() => createMockToolSpan('tool.execution', {})),
// In runSingleTool, add executionSpanRecord to the return value and assert:
// 1. expect(startToolExecutionSpan).toHaveBeenCalled()
// 2. For success path: execSpan ended with success not false
// 3. For exception path: execSpan ended with { success: false, error: '...' }
```
— DeepSeek/deepseek-v4-pro via Qwen Code /review
```typescript
  const duration = metadata?.durationMs ?? Date.now() - spanCtx.startTime;
  const endAttributes: Attributes = { duration_ms: duration };
  try {
```
[Suggestion] The three end*Span functions (endLLMRequestSpan, endToolSpan, endToolExecutionSpan) share nearly identical ~30-line structures — lookup spanCtx, check ended, try/catch for attributes+status, separate try/catch for span.end(), cleanup from maps. Only metadata handling differs slightly.
Consider extracting a private helper to reduce maintenance burden and drift risk across the three copies.
```typescript
try {
function _endSpanCommon(
  span: Span,
  metadata: Record<string, unknown> | undefined,
  resolveStatus: (meta) => { code: SpanStatusCode; message?: string } | undefined,
): void {
  // shared lookup, try/catch, end(), cleanup
}
```
— DeepSeek/deepseek-v4-pro via Qwen Code /review
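To make the suggested extraction concrete, here is one possible shape for the shared helper. It is a sketch, not the PR's code: `SpanLike`, `SpanRecord`, `Status`, the `activeSpans` registry, and the `warn` parameter are all stand-ins assumed for illustration, and the real module's maps and OTel types may differ.

```typescript
// Hypothetical stand-ins for the OTel types used by session-tracing.ts.
interface SpanLike {
  setAttributes(attrs: Record<string, unknown>): void;
  setStatus(status: Status): void;
  end(): void;
}
type Status = { code: number; message?: string };
interface SpanRecord {
  span: SpanLike;
  startTime: number;
  ended: boolean;
}

// Assumed registry of live spans; the real module's bookkeeping may differ.
const activeSpans = new Map<SpanLike, SpanRecord>();

function endSpanCommon<M extends { durationMs?: number }>(
  span: SpanLike,
  metadata: M | undefined,
  resolveStatus: (meta: M | undefined) => Status | undefined,
  warn: (msg: string) => void = () => {},
): void {
  const record = activeSpans.get(span);
  if (!record || record.ended) return; // unknown or already ended: no-op
  record.ended = true;
  try {
    const duration = metadata?.durationMs ?? Date.now() - record.startTime;
    record.span.setAttributes({ duration_ms: duration });
    const status = resolveStatus(metadata);
    // No status resolved means any pre-set status is left untouched.
    if (status) record.span.setStatus(status);
  } catch (e) {
    warn(`span finalization failed: ${String(e)}`);
  }
  try {
    record.span.end(); // runs even if the block above threw
  } catch (e) {
    warn(`span.end failed: ${String(e)}`);
  } finally {
    activeSpans.delete(span);
  }
}
```

Each `end*Span` function would then supply only its own `resolveStatus`, which keeps the ended guard, the split try/catch, and the cleanup in a single place.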
```typescript
        prompt_id: promptId,
      }),
  ),
  endLLMRequestSpan: vi.fn(
```
[Suggestion] endLLMRequestSpan call parameters (inputTokens/outputTokens/durationMs) from loggingStreamWrapper's finally block are not verified by tests.
The test mock is a pass-through that only tracks success/error but ignores token counts and duration. If usageMetadata extraction logic changes, token counts would silently drop from spans.
```typescript
endLLMRequestSpan: vi.fn(
// Enhance the mock to capture call parameters and assert:
// expect(endLLMRequestSpan).toHaveBeenCalledWith(
//   expect.anything(),
//   expect.objectContaining({ inputTokens: ..., outputTokens: ..., durationMs: ... })
// );
```
— DeepSeek/deepseek-v4-pro via Qwen Code /review
```diff
  endToolSpan(span);

- expect(mockSpans[0]!.statuses[0]!.code).toBe(SpanStatusCode.OK);
+ expect(mockSpans[0]!.statuses).toHaveLength(0);
```
[Suggestion] The try/catch resilience paths in end*Span functions — where a failing setAttributes/setStatus does not prevent span.end() — are untested.
The test mock never throws from setAttributes or setStatus. If this resilience logic is broken in a future refactor, failed attribute updates could skip span.end(), causing spans to leak in activeSpans until the 30-minute TTL cleanup.
```typescript
expect(mockSpans[0]!.statuses).toHaveLength(0);
// Add a test with a throwing mock:
const throwingSpan = createMockSpan('test', { throwOnSetStatus: true });
endLLMRequestSpan(throwingSpan, { success: true });
expect(throwingSpan.end).toHaveBeenCalled(); // span.end() still runs
```
— DeepSeek/deepseek-v4-pro via Qwen Code /review
```typescript
  TOOL_FAILURE_KIND_PRE_HOOK_BLOCKED,
  TOOL_SPAN_STATUS_PRE_HOOK_BLOCKED,
  TOOL_FAILURE_KIND_POST_HOOK_STOPPED,
  TOOL_SPAN_STATUS_POST_HOOK_STOPPED,
```
[Critical] Leaked execSpan on post-hook-stop path
The return at line 2076 exits _executeToolCallBody after calling setToolSpanFailure, but endToolExecutionSpan(execSpan, ...) is never called. The execSpan was started at line 1960 and is normally ended at line 1999 or in the catch block at line 2216, but the early return bypasses both. The sub-span leaks — it remains in activeSpans/strongSpans until the 30-minute TTL cleanup force-ends it.
```typescript
TOOL_SPAN_STATUS_POST_HOOK_STOPPED,
setToolSpanFailure(
  span,
  TOOL_FAILURE_KIND_POST_HOOK_STOPPED,
  TOOL_SPAN_STATUS_POST_HOOK_STOPPED,
);
endToolExecutionSpan(execSpan, {
  success: false,
  error: TOOL_SPAN_STATUS_TOOL_EXCEPTION,
});
return;
```
```typescript
      const toolUseId = generateToolUseId();
      try {
        const toolResult: ToolResult = await promise;
        endToolExecutionSpan(execSpan, {
```
[Suggestion] endToolExecutionSpan called before post-processing, making catch-block call dead code
endToolExecutionSpan(execSpan, ...) is called immediately after await promise resolves, before post-processing (abort check, post-hook, skill activation). If post-processing subsequently throws, the catch block at line 2216 calls endToolExecutionSpan again, but the spanCtx.ended guard silently no-ops it — the execSpan retains success: true even when the overall tool call failed in post-processing. Consider moving the success-path endToolExecutionSpan call to after the post-processing block so the execSpan captures the full lifecycle.
```typescript
  }

  span.setAttributes(endAttributes);
```
[Suggestion] endLLMRequestSpan uses the passed Span parameter for mutations while endToolSpan uses spanCtx.span
endLLMRequestSpan calls span.setAttributes(), span.setStatus(), span.end() on the passed parameter. endToolSpan correctly looks up spanCtx.span from activeSpans and uses that for all mutations. This inconsistency is a maintenance trap — a future developer following one pattern could introduce a bug in the other. Consider aligning endLLMRequestSpan to use spanCtx.span for mutations, matching endToolSpan.
…ce, test coverage
- coreToolScheduler.executeSingleToolCall: move try-block to wrap invocation.execute() so synchronous throws (e.g. shell setup failure) flow into the same catch path as async rejections. Previously a sync throw would leak the execution span and skip failure hooks.
- loggingStreamWrapper: track spanEndedByTimeout flag so a stream that resumes after the 5-min idle timeout does not run the final endLLMRequestSpan (which would no-op anyway, but the flag also stops resetSpanTimeout from queuing further timer callbacks).
- coreToolScheduler.test: add execution sub-span assertions for success, ToolResult.error, thrown invocation exceptions, and pre-hook denial.
- loggingContentGenerator.test: capture setAttribute calls into the mock span attributes record; assert llm_request.stream is false for non-stream and true for stream paths.
Summary
Replaces disconnected `withSpan`/`startSpanWithContext` calls in runtime with session-tracing typed helpers, fixing the trace tree so LLM and tool spans become children of the interaction span instead of siblings under the session root.

Before (flat, all spans are siblings):
After (hierarchical, proper parent-child):
Key changes
- `session-tracing.ts`: Add `toolContext` ALS with `runInToolSpanContext()` for concurrent-safe tool span scoping (uses `AsyncLocalStorage.run`, not `enterWith`); change `startToolExecutionSpan` to read parent from `toolContext`; fix `endToolSpan` to not override pre-set status when metadata is omitted (documented via JSDoc); add `debugLogger.warn` in all `end*Span` catch blocks for OTel error visibility; warn when `startToolExecutionSpan` is called outside `runInToolSpanContext`
- `loggingContentGenerator.ts`: Wire `startLLMRequestSpan`/`endLLMRequestSpan` for both streaming and non-streaming paths, replacing `withSpan`/`startSpanWithContext`; integrate upstream `spanEndTimeout` idle mechanism with `endLLMRequestSpan` idempotency; add `llm_request.stream` attribute to distinguish streaming vs non-streaming
- `coreToolScheduler.ts`: Wire `startToolSpan`/`endToolSpan` + `startToolExecutionSpan`/`endToolExecutionSpan` with `try/finally` lifecycle; extract `_executeToolCallBody` for clean span scoping; sanitize error in `endToolExecutionSpan` (use constant instead of raw message); set OK status in success path via `safeSetStatus`
- `client.ts`: Remove redundant `withSpan('client.generateContent')` wrapper (LLM span now created in loggingContentGenerator); preserve abort error propagation
- Tests: mock `session-tracing.js` directly for reliable interception; add `toolContext` ALS scoping tests
- `docs/design/workflow-tracing-gaps.md` with full gap analysis and Phase 1-3 roadmap

Part of #3731 (P3 Deeper observability, Phase 1)
Phase 1 focuses on unifying span creation paths. Phase 2 (blocked_on_user + hook spans) and Phase 3 (subagent trace tree) tracked in the parent issue.
Test plan
- `npx vitest run packages/core/src/telemetry/session-tracing.test.ts`: 23 tests pass
- `npx vitest run packages/core/src/core/coreToolScheduler.test.ts`: 119 tests pass
- `npx vitest run packages/core/src/core/loggingContentGenerator/loggingContentGenerator.test.ts`: 32 tests pass
- `npx vitest run packages/core/src/core/client.test.ts`: 104 tests pass
- `npx tsc --noEmit`: zero new type errors

E2E trace verification

```shell
rm -f /tmp/spans-test.jsonl
QWEN_TELEMETRY_ENABLED=1 \
QWEN_TELEMETRY_OUTFILE=/tmp/spans-test.jsonl \
node packages/cli/dist/index.js --prompt "what is 2+2" --max-session-turns 1
```

Verify parent-child relationships:
Actual output:

```json
[
  {"name":"qwen-code.llm_request","spanId":"a11e955e775f","parentSpanId":"1a88a402e766","status":2,"stream":false,"tool":null},
  {"name":"qwen-code.llm_request","spanId":"b52fc1e4267d","parentSpanId":"1a88a402e766","status":1,"stream":true,"tool":null},
  {"name":"qwen-code.interaction","spanId":"1a88a402e766","parentSpanId":"ROOT","status":1,"stream":null,"tool":null},
  {"name":"qwen-code.llm_request","spanId":"e46925657044","parentSpanId":"1a88a402e766","status":1,"stream":true,"tool":null},
  {"name":"qwen-code.tool.execution","spanId":"31c2ee6fb1f5","parentSpanId":"4709050c8523","status":1,"stream":null,"tool":null},
  {"name":"qwen-code.tool","spanId":"4709050c8523","parentSpanId":"1a88a402e766","status":1,"stream":null,"tool":"read_file"},
  {"name":"qwen-code.tool.execution","spanId":"7df917fb02cd","parentSpanId":"0ccbfe610d63","status":1,"stream":null,"tool":null},
  {"name":"qwen-code.tool","spanId":"0ccbfe610d63","parentSpanId":"1a88a402e766","status":1,"stream":null,"tool":"read_file"},
  {"name":"qwen-code.llm_request","spanId":"4c3ae4665b69","parentSpanId":"1a88a402e766","status":1,"stream":true,"tool":null}
]
```

Expected trace tree (confirmed):
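One way to check the relationships in the actual output without reading it by eye is to assert, per span name, which span name its parent must have. The sketch below assumes a hypothetical `findMisparented` helper and uses four rows copied from the output above; the expected-parent table is inferred from that output and is not part of the PR.

```typescript
// Shape of one row in the span dump; "ROOT" marks a span with no parent.
interface SpanRow {
  name: string;
  spanId: string;
  parentSpanId: string;
}

// Returns "name:spanId" for every span whose parent has an unexpected name.
// Span names missing from expectedParent are not checked.
function findMisparented(
  rows: SpanRow[],
  expectedParent: Record<string, string>,
): string[] {
  const byId = new Map<string, SpanRow>();
  for (const row of rows) byId.set(row.spanId, row);

  const violations: string[] = [];
  for (const row of rows) {
    const want = expectedParent[row.name];
    if (want === undefined) continue;
    const parent = byId.get(row.parentSpanId);
    if (parent?.name !== want) violations.push(`${row.name}:${row.spanId}`);
  }
  return violations;
}

// Four rows copied from the actual output.
const sampleRows: SpanRow[] = [
  { name: 'qwen-code.interaction', spanId: '1a88a402e766', parentSpanId: 'ROOT' },
  { name: 'qwen-code.llm_request', spanId: 'a11e955e775f', parentSpanId: '1a88a402e766' },
  { name: 'qwen-code.tool', spanId: '4709050c8523', parentSpanId: '1a88a402e766' },
  { name: 'qwen-code.tool.execution', spanId: '31c2ee6fb1f5', parentSpanId: '4709050c8523' },
];

const violations = findMisparented(sampleRows, {
  'qwen-code.llm_request': 'qwen-code.interaction',
  'qwen-code.tool': 'qwen-code.interaction',
  'qwen-code.tool.execution': 'qwen-code.tool',
});
```

For these rows `violations` is empty. A flat trace, where every span hangs off the session root, would list each LLM and tool span instead.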
🤖 Generated with Qwen Code